Modern Pathology
○ Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match Modern Pathology's content profile, based on 10 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.
Spirgath, K.; Huang, B.; Safraou, Y.; Kraftberger, M.; Dahami, M.; Kiehl, R.; Stockburger, C. H. F.; Bayerl, C.; Ludwig, J.; Jaitner, N.; Kühl, A.; Asbach, P.; Geisel, D.; Hillebrandt, K. H.; Wells, R. G.; Sack, I.; Tzschätzsch, H.
Background & Aims: The increasing global prevalence of metabolic dysfunction-associated steatotic liver disease (MASLD), including metabolic dysfunction-associated steatohepatitis (MASH), creates an urgent need for objective methods of histopathological assessment. Conventional histological approaches are time-consuming and rely on the interpreter's experience. The results may therefore suffer from high variability and offer only coarse categorisation. In this study, we propose a fully automated, deep-learning-based pipeline for the segmentation and characterisation of histological liver features for MASH/MASLD assessment. Methods: Segmentation was applied to H&E sections from 45 mice and 44 humans with MASH/MASLD. The method, which we named qHisto (quantitative histology), utilises the nnU-Net framework and quantifies key histological components of the MASH score, including macro- and microvesicular steatosis, fibrosis, inflammation, hepatocellular ballooning and glycogenated nuclei. Additionally, we characterised the tissue using novel features that are inaccessible through manual histology, such as the distribution of fat droplet sizes, aspect ratio of nuclei and heatmaps. Results: qHisto parameters showed strong positive correlations with conventional histology scores (fat area R=0.91, inflammation density R=0.7, ballooning density R=0.49) and also with quantitative magnetic resonance imaging (fat area vs. hepatic fat fraction R=0.87). Our novel scores showed that deformation of nuclei is driven by large fat droplets rather than the overall amount of fat. Conclusions: A key advantage of our method is spatially resolved, precise histological quantification. These features provide a more finely resolved assessment of disease severity than conventional categorical scoring.
By automating time-consuming and repetitive readouts, qHisto improves standardisation and reproducibility of MASH/MASLD feature quantification and provides scalable, slide-wide readouts that can support histopathologists and enhance clinical assessment and therapeutic development. Impact and Implications: The proposed method provides an objective, automatic tool for comprehensive histological liver analysis of MASH/MASLD, which can be extended to other diseases and organs. By offering classic and novel quantitative parameters and scores, our method could support histologists in their daily routines and provide researchers with further insight into steatotic liver diseases.
Shimizu, A.; Imamura, K.; Yoshimura, K.; Atsushi, T.; Sato, M.; Harada, K.
Drug-induced liver injury (DILI) is an acute inflammatory liver disease caused not only by prescription and over-the-counter medications but also by health foods and dietary supplements. Typically, DILI patients recover once the causative substance is identified and discontinued. In contrast, autoimmune hepatitis (AIH) results from the immune-mediated destruction of hepatocytes due to a breakdown of self-tolerance mechanisms. Patients presenting with acute-onset AIH often lack characteristic clinical features, such as autoantibodies, and require prompt steroid treatment to prevent progression to liver failure. Liver biopsy currently remains the gold standard to differentiate acute DILI from AIH; however, general pathologists face significant diagnostic challenges due to overlapping histopathological features. This study integrates pathology expertise with deep learning-based artificial intelligence (AI) to differentiate DILI from AIH using histopathological images. Our AI model demonstrates promising classification accuracy (Accuracy 74%, AUC 0.81). This paper presents a detailed pathological analysis alongside AI methods, discusses the current model performance and limitations, and proposes directions for future improvements.
Jeong, W. C.; Kim, H. H.; Hwang, Y.; Hwang, G.; Kim, K.; Ko, Y. S.
The Updated Sydney System (USS) provides a standardized framework for grading gastritis and stratifying gastric cancer risk. However, subjective observer variability and labor-intensive workflows impede its routine clinical use. To address these challenges, we developed SydneyMTL, a multi-task deep learning framework that uses Multiple Instance Learning (MIL) with task-specific attention pooling to predict severity grades across all five USS attributes simultaneously. Trained on an unprecedented cohort of 50,765 whole-slide images (WSIs), SydneyMTL generates interpretable histologic evidence for clinical practice. In retrospective evaluations against 24 board-certified pathologists, the model achieved an overall mean lenient accuracy of 89.1%, with 22 pathologists exhibiting >80% agreement with the model. When evaluated on an expert-adjudicated "Golden dataset," the model's performance improved to 90.2%, demonstrating its capacity to align with multi-expert consensus and filter individual annotator noise. Latent space analysis confirmed that SydneyMTL captures the ordinal structure of the USS by representing disease severity as a continuous biological spectrum rather than as disjoint categories. Finally, a randomized crossover reader study showed that AI-assisted review significantly reduced interpretation time and improved inter-observer agreement, establishing SydneyMTL as a scalable tool for supporting standardized gastric cancer risk stratification.
Graphical abstract: Figure 1.
Highlights:
- SydneyMTL is the first unified framework to simultaneously predict the full 4-tier severity grades across all five Updated Sydney System attributes.
- Trained on a massive cohort of 50,765 whole-slide images, the model aligns with multi-expert consensus on a rigorous "Golden dataset".
- AI assistance significantly reduces pathologist reading time and harmonizes inter-observer variability in real-world clinical workflows.
- Latent space analysis confirms that SydneyMTL preserves the biological ordinality of disease severity without explicit ordinal constraints.
The bigger picture: Gastritis is among the most frequent diagnoses in gastrointestinal pathology, and its histologic severity is central to gastric cancer prevention. In routine practice, pathologists convert subtle mucosal changes into semi-quantitative, ordinal grades using the Updated Sydney System, which evaluates five co-existing histologic dimensions. While this framework provides a shared language, grading is labor-intensive and inherently dependent on reader-specific thresholds, creating variability that affects risk stratification and surveillance. A key concept motivating our study is that gastritis is not defined by a single finding but by multiple criteria that co-occur and interact. This suggests that computational models should learn these criteria jointly, capturing their biological correlations and the continuum of severity, rather than treating each grade as an isolated classification task. SydneyMTL implements this perspective through a unified multi-task, weakly supervised approach that learns directly from a massive cohort of 50,765 routine whole-slide images. Beyond diagnostic accuracy, our work reveals that the model preserves the ordinality of severity in its representation space, supporting the biological view that discrete clinical categories approximate an underlying continuous biological spectrum. Its attention-based explanations also connect model outputs to interpretable tissue evidence, enhancing clinical trust.
Crucially, by harmonizing inter-observer variability, SydneyMTL provides a more reliable foundation for gastric cancer risk assessment, ensuring that premalignant changes are captured with greater consistency. More broadly, our findings reposition AI for gastritis from narrow detection toward scalable, evidence-based decision support that can standardize grading practices and reduce cognitive burden on the global pathology workforce.
Niggemeier, L.; Hoelscher, D. L.; Herkens, T. C.; Gilles, P.; Boor, P.; Buelow, R.
Introduction: Kidney biopsy reports contain rich information that is clinically actionable and useful for research. However, the narrative format hinders scalable reuse. Here, we investigated whether open-source large language models (LLMs) can extract relevant, standardized readouts from native kidney biopsy pathology reports. Methods: German free-text native kidney biopsy reports were parsed with three open-source LLMs (Llama3 70B, Llama3 8B, MedGemma) to generate structured JSON outputs covering relevant report elements (e.g., diagnosis, glomerular counts, histopathological patterns). Two independent observers manually curated the same report elements; disagreements between the two were resolved by an experienced nephropathologist to create the final ground truth. Performance was assessed using strict and soft matching and summarized as accuracy. Inter-rater agreement was quantified using Cohen's and Light's kappa with 95% confidence intervals obtained by bootstrapping with 1,000 resamples. Results: Llama3 70B achieved the highest overall accuracy (93.3% strict, 97.1% soft), followed by MedGemma. These larger models showed near-perfect performance for explicit and discrete variables and positivity of immunohistochemistry markers, while accuracy decreased for report elements requiring interpretation (e.g., primary diagnosis, interstitial inflammation in fibrotic vs. non-fibrotic cortex). Human raters showed strong agreement for the primary diagnosis (κ = 0.74, 95% CI 0.64-0.84). Adding Llama3 70B or MedGemma as a third rater increased overall agreement (0.82, 95% CI 0.74-0.89 and 0.78, 95% CI 0.69-0.85, respectively), whereas Llama3 8B reduced it. Conclusions: Open-source LLMs can accurately transform narrative nephropathology reports into a structured and machine-readable format, potentially supporting scalable retrospective cohort building. While some report elements can be extracted without supervision, interpretation-dependent elements should be supervised by a human observer.
Lay Summary: Retrospective data collection from nephropathology reports is essential for building informative cohorts in computational nephropathology research, yet manual processing of narrative reports is time-consuming and limits scalability. In this study, we demonstrate that open-source large language models can reliably extract key diagnostic, quantitative, and descriptive data elements from kidney biopsy reports with high accuracy. While factual and clearly stated report elements can be extracted automatically, findings that require contextual or interpretative judgment still benefit from expert supervision. Overall, this approach substantially reduces manual effort and enables efficient generation of structured datasets from diagnostic routine, facilitating the development of kidney registries and future computational nephropathology research. In addition, such systems could be implemented in the routine diagnostic workflow to directly transform narrative reports into structured data.
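The agreement statistics named in the abstract above (Cohen's kappa for two raters, Light's kappa as the mean of all pairwise Cohen's kappas, and a percentile bootstrap interval) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names, the percentile method and the fixed seed are assumptions made here.

```python
import math
import random
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("raters must label the same non-empty set of reports")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when both raters use one single label
    return (observed - expected) / (1.0 - expected)


def light_kappa(raters):
    """Light's kappa: mean of Cohen's kappa over all pairs of raters."""
    pairs = list(combinations(raters, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def bootstrap_ci(a, b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Cohen's kappa (assumed method, hypothetical helper)."""
    rng = random.Random(seed)
    n = len(a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample reports with replacement
        k = cohen_kappa([a[i] for i in idx], [b[i] for i in idx])
        if not math.isnan(k):
            stats.append(k)
    if not stats:
        raise ValueError("kappa was undefined in every resample")
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper
```

Adding a model as a third rater, as the study describes, would correspond to calling `light_kappa` with three label lists instead of two.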
Ayad, M. A.; McCortney, K.; Congivaram, H. T. S.; Hjerthen, M. G.; Steffens, A.; Zhang, H.; Youngblood, M. W.; Heimberger, A. B.; Chandler, J. P.; Jamshidi, P.; Ahrendsen, J. T.; Magill, S. T.; Raleigh, D. R.; Horbinski, C. M.; Cooper, L. A. D.
Meningiomas are the most common primary brain tumors and, despite their benign reputation, often behave aggressively. Meningiomas are morphologically heterogeneous, yet the full significance of their histologic diversity is unclear. This is in large part because many features are not readily quantifiable by traditional observer-based light microscopy. Molecular testing improves prognostic stratification, but is not universally accessible. We therefore sought to determine whether an artificial intelligence (AI)-trained program could predict specific genomic and epigenomic patterns in meningiomas, and whether it could extract more prognostic information out of standard hematoxylin and eosin (H&E) histopathology than the current WHO classification. To do this, we developed Morphologic Set Enrichment (MSE), an interpretable computational pathology framework that quantifies statistical enrichment of morphologic patterns, cells, and tissue architecture from H&E whole-slide images. The MSE meningioma histology program was able to accurately predict DNA methylation subtypes and concurrent chromosome 1p/22q losses, in the process identifying specific morphologic patterns associated with key genomic and epigenomic alterations. It also added prognostic value independent of standard clinical and pathological variables. These results demonstrate that AI-based quantitative morphologic profiling can capture clinically and biologically relevant information that redefines risk stratification for meningiomas, incorporating histological information not included in existing grading schemes.
Abolfathi, H.; Maranda-Robitaille, M.; Lamaze, F. C.; Kordahi, M.; Armero, V. S.; Orain, M.; Fiset, P. O.; Joubert, D.; Desmeules, P.; Gagne, A.; Yatabe, Y.; Bosse, Y.; Joubert, P.
Background: Histologic descriptors such as lymphovascular invasion (LVI), visceral pleural invasion (VPI), spread through air spaces (STAS), and the grading system have each been associated with adverse outcomes in lung adenocarcinoma (LUAD). However, with the exception of VPI, these features are not formally incorporated into the TNM staging system. We evaluated the prognostic value and incremental contribution of these histologic descriptors within the framework of the 9th edition TNM staging system. Methods: In total, 1,745 individuals diagnosed with stage I-III invasive non-mucinous lung adenocarcinoma (NM-LUAD) were included in this study, comprising 1,139 French-Canadian patients who underwent surgical resection at IUCPQ-Université Laval (discovery cohort) and 606 patients from the National Cancer Center Hospital in Tokyo, Japan (validation cohort). The objective of this study was to assess the prognostic contribution of histologic descriptors, including STAS and LVI, as complements to conventional 9th edition TNM staging. Results: Grade 3 tumors, LVI, and STAS were identified in 880 (50.4%), 809 (46.4%), and 775 (44.4%) of 1,745 cases, respectively. Histologic grade and LVI demonstrated the strongest associations, particularly in early-stage disease, while STAS exhibited a stage-dependent effect, being more impactful in stages II-III. VPI showed less consistent prognostic value. Incorporating these histologic descriptors into TNM staging improved prognostic model performance, with the largest gains driven by histologic grade and LVI, while STAS provided additional, complementary prognostic refinement. Conclusion: These findings demonstrate that key histologic descriptors (including the grading system, LVI, and STAS) represent robust and reproducible prognostic parameters.
Importantly, these descriptors provide complementary, stage-dependent information that may enhance risk stratification and inform refinement of future TNM staging frameworks, including the forthcoming 10th edition.
Kaistha, A.; Situ, J. J.; Evans, S. C.; Ashton-Key, M.; Ogg, G.; Soilleux, E. J.
T-cell lymphomas are often histologically indistinguishable from benign T-cell infiltrates, so clonality testing is frequently required for diagnosis. However, clonality testing lacks spatial context and is slow and expensive, relying on complex, multiplexed PCR reactions interpreted by experienced scientists or pathologists. We previously published details of a pair of highly specific monoclonal antibodies against the two alternatively used, but very similar, T-cell receptor β constant regions, TCRβ1 and TCRβ2. We demonstrated the feasibility of immunohistochemical detection of TCRβ1 and TCRβ2 in formalin-fixed, paraffin-embedded (FFPE) tissue as a novel diagnostic strategy for T-cell lymphomas. Here we validate an improved pairing of TCRβ1/2 rabbit monoclonal antibodies and demonstrate their utility for single and double immunostaining, including with a chimeric mouse anti-TCRβ2 antibody. Finally, we show that this staining is amenable to automated cell counting, permitting accurate calculation of the TCRβ2:TCRβ1 ratio.
Jiang, B.; Zhang, Y.; Sheng, H.; Wang, Q.; Hu, B.; Wang, L.; Fu, J.
Objective: To explore the value of dual staining for special AT-rich sequence-binding protein 2 (SATB2) immunohistochemistry and elastic lamina in detecting elastic lamina invasion (ELI) in pT3 colon cancer, and to assess its association with clinicopathological characteristics, staging, and prognosis. Methods: This retrospective cohort study enrolled 176 pT3 colon cancer patients who underwent radical resection at Affiliated Jinhua Hospital, Zhejiang University School of Medicine. The paraffin blocks with the deepest tumor infiltration were collected for SATB2 immunohistochemistry and elastin dual staining. Correlations between ELI status and clinicopathological characteristics and prognosis were analyzed. Survival data of 74 pT4a stage patients were collected for comparative analysis. Results: ELI (+) was positively associated with high tumor budding grade, vascular invasion, lymph node metastasis, and reduced tumor-infiltrating lymphocytes (TILs) (all P < 0.001). No correlations were observed with age, gender, tumor location, histological subtype, tumor grade, or perineural invasion (all P > 0.05). The ELI (+) group exhibited significantly shorter disease-free survival (DFS) and overall survival (OS) than the ELI (-) group (P < 0.05). Additionally, the ELI (+) group demonstrated inferior OS compared with the pT4a group, though DFS did not differ significantly. Conclusion: Dual staining of SATB2 immunohistochemistry and elastic lamina provides a reproducible and objective method for assessing ELI. ELI correlates with key clinicopathological features and functions as an independent adverse prognostic indicator in pT3 colon cancer.
Gernand, A. D.; Walker, R.; Pan, Y.; Mehta, M.; Sincerbeaux, G.; Gallagher, K.; Bebell, L. M.; Ngonzi, J.; Catov, J. M.; Skvarca, L. B.; Wang, J. Z.; Goldstein, J. A.
Background: Placental growth and function are imperative for healthy fetal growth; data on placentas can inform research and clinical care. Measuring placental size after delivery should be easy, but current methods are hard to standardize and error-prone. We developed PlacentaVision using artificial intelligence (AI)-based models to automatically, accurately, and precisely measure placentas from digital photographs. Objective: We aimed to compare placental disc morphology between gross pathology examination (human measurements) and our automated PlacentaVision model (AI measurements). Methods: PlacentaVision is a multi-site study to assess placental morphology, features, and pathologies from digital photographs. We built a large dataset of digital placenta photographs and clinical data from singleton births at three large hospitals: Northwestern Memorial (Chicago; n=24,933), UPMC Magee-Womens (Pittsburgh; n=1,198) and Mbarara Regional Referral (Uganda; n=1,715). Data and images were from the medical record for Northwestern, part of a biobank study for Magee, and from our prospective studies for Mbarara. We compared long and short disc axis length (defined by Amsterdam criteria) between human and AI-based PlacentaVision measurements by calculating the difference and using Bland-Altman analysis; we stratified by site, disc shape, infant sex, and term/preterm birth. Results: Mean (SD) disc length was 19.2 (3.1) and 18.6 (3.1) cm from PlacentaVision and human measurement, respectively, with a difference of 0.57 (2.19) cm. Disc width was 16.3 (2.3) cm and 16.1 (2.4) cm from PlacentaVision and human measurement, respectively, with a difference of 0.25 (1.85) cm. Bland-Altman limits of agreement were -3.7 to 4.9 cm for length and -3.4 to 3.9 cm for width. Irregularly shaped placentas had a greater difference between PlacentaVision and human measurements than those with round/oval shapes (length differences of 1.53 and 0.45 cm, respectively).
Further, there were length differences by site (Northwestern 0.6, Magee 0.0, and Mbarara 0.4 cm) and gestational age at birth (preterm 0.71, term 0.53 cm), but similar results for male and female placentas. Results for width were similar to those for length. Conclusions: On average, AI-based measurements were less than 1 cm from human measurements. Our findings of larger differences for irregular shapes and preterm births may indicate that it is difficult for humans to measure irregular or small placentas according to protocol. PlacentaVision can automate and standardize the process.
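For readers unfamiliar with the Bland-Altman analysis used in the abstract above, the following minimal sketch shows the usual computation: the bias is the mean of the paired differences and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of those differences. The function names are hypothetical, and the 1.96 multiplier is the conventional choice rather than something the abstract states.

```python
from statistics import mean, stdev


def limits_from_summary(bias, sd_diff, z=1.96):
    """95% limits of agreement from the mean and SD of paired differences."""
    return bias - z * sd_diff, bias + z * sd_diff


def bland_altman(ai_cm, human_cm):
    """Return (bias, lower limit, upper limit) for paired measurements in cm."""
    if len(ai_cm) != len(human_cm) or len(ai_cm) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - h for a, h in zip(ai_cm, human_cm)]
    bias = mean(diffs)
    lower, upper = limits_from_summary(bias, stdev(diffs))
    return bias, lower, upper


# Consistency check against the summary values reported in the abstract:
# length difference 0.57 (SD 2.19) cm, width difference 0.25 (SD 1.85) cm.
length_limits = limits_from_summary(0.57, 2.19)
width_limits = limits_from_summary(0.25, 1.85)
```

Rounded to one decimal place, these limits reproduce the reported ranges of -3.7 to 4.9 cm for length and -3.4 to 3.9 cm for width.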
Kibera, J.; Bender, J. B.; Kobia, F. M.; Kibaya, R.; Gitonga, M.; Gitonga, F.; Ondieki, F.; Killingo, B.; Kepha, S.; Achakolong, M.; Gelalcha, B.; Mahero, M.
Background: Hepatocellular carcinoma (HCC) is a leading cause of cancer-related death in sub-Saharan Africa (SSA). Differentiating primary HCC from metastatic liver tumors remains a significant diagnostic challenge. Understanding the prevalence and clinical predictors of HCC is crucial for improving diagnosis and patient care. This study examined the prevalence of hepatitis B virus (HBV), hepatitis C virus (HCV), and HCC, and clinical predictors of HCC. Methods: We used immunohistochemical markers on archived liver tumor biopsies and analyzed the data using descriptive statistics and logistic regression. Results: Among 58 liver carcinoma cases, 37.9% had HCC and 62.1% had metastatic liver carcinoma (MLC). HCC was most common (61.5%) among middle-aged adults (50-59 years). HCC was more frequent in males (47.2%) than in females (22.7%). Over half of the patients (51.7%) tested positive for HBV. HCC was more prevalent in HBV-positive patients than in HBV-negative ones (43.3% vs 32.1%). Hepatic fibrosis was identified in 27.6% of cases. HCC was more common in patients with fibrosis (56.2%) than in those without (31%). HCV infection was rare (6.9%) in this study. In multivariable logistic regression analysis, none of the examined predictors reached statistical significance (P>0.05), although patients aged 50-59 years, males, and those with HBV infection or hepatic fibrosis showed higher odds of HCC. Hepatocyte Paraffin-1 (Hep Par-1) demonstrated 97% specificity and a 95% positive predictive value (PPV) for differentiating HCC from MLC. The combined marker pattern of Hep Par-1 positive and AE1/AE3 negative was highly predictive of HCC (100% specificity, 100% PPV, and 93.2% diagnostic accuracy). Conclusions: Our findings indicate that while the assessed risk factors tend to show the expected directional association with HCC, larger studies are needed to determine their independent effects. The combined Hep Par-1/AE1/AE3 immunophenotype is more accurate than either marker alone.
Therefore, this combined test is a valuable diagnostic tool for confirming HCC in resource-limited settings.
Mittal, P.; Singh, D.; Chauhan, J.
We propose a lesion-centric phenotype learning pipeline for interpretable breast ultrasound (BUS). Predicted lesion masks are used for mask-weighted pooling of segmentation-encoder latents, producing compact embeddings that suppress background influence; a lightweight calibration step improves cross-dataset consistency. We cluster embeddings to discover latent phenotypes and relate phenotype structure to morphology descriptors (compactness, boundary sharpness). On BUSI and BUS-UCLM with external testing on BUS-BRA, lesion-centric pooling and calibration improve separability and enable strong malignancy probing (AUC 0.982), outperforming radiomics and a standard CNN baseline. A simple rule-gated generator further improves BI-RADS-style descriptor consistency on difficult cases.
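The mask-weighted pooling step mentioned in the abstract above can be illustrated with a small sketch. It assumes the common formulation in which each channel of an encoder feature map is averaged using the predicted lesion mask as weights, so background pixels contribute little. The function name, the nested-list layout (channels x height x width) and the `eps` stabiliser are assumptions made here, not details taken from the paper.

```python
def mask_weighted_pool(features, mask, eps=1e-8):
    """Pool a C x H x W feature map into C values, weighting pixels by the mask.

    features: list of C channels, each an H x W list of lists of floats.
    mask: H x W list of lists with weights in [0, 1] (predicted lesion mask).
    """
    mask_total = sum(sum(row) for row in mask)
    pooled = []
    for channel in features:
        # Sum of feature values weighted by the mask at the same pixel.
        weighted_sum = sum(
            f * m
            for feature_row, mask_row in zip(channel, mask)
            for f, m in zip(feature_row, mask_row)
        )
        # eps avoids division by zero when the mask is empty.
        pooled.append(weighted_sum / (mask_total + eps))
    return pooled


# Toy example: two channels on a 2 x 2 grid, lesion on the diagonal pixels.
toy_features = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 10.0], [10.0, 10.0]]]
toy_mask = [[1.0, 0.0], [0.0, 1.0]]
toy_embedding = mask_weighted_pool(toy_features, toy_mask)
```

In the toy example only the two diagonal pixels carry weight, so the first channel pools to roughly 2.5 and the second to roughly 10.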
Just, M. K.; Christensen, K. B.; Wirenfeldt, M.; Steiniche, T.; Parkkinen, L.; Myllykangas, L.; Borghammer, P.
Objective: Brain banks preserve extensive material relevant to neurodegenerative disease research. As these collections age, tissue becomes archival, raising the question of whether long-term fixed and stored human brain tissue remains suitable for contemporary immunohistochemical analyses. Materials and Methods: Forty-one autopsy brains collected between 1946 and 1980 were examined. For each case, midbrain and hippocampus were available both as original paraffin-embedded blocks and as tissue stored long term in fixative. New paraffin blocks were prepared from the long-term fixed tissue. Sections from original and newly prepared blocks were immunohistochemically stained for α-synuclein, hyperphosphorylated tau and amyloid-β. Immunoreactivity was assessed using semi-quantitative scoring. Results: Original blocks consistently showed good staining intensity and morphological preservation for each protein pathology. Newly prepared blocks showed slightly lower semi-quantitative scores for Lewy-related pathology, without statistically significant differences, except for astrocytic α-synuclein in the substantia nigra in cases from the 1960s. Tau pathology displayed modestly reduced labelling, particularly of the neuropil threads and neurofibrillary tangles, most evident in cases from the 1950s. Amyloid-β-positive senile plaques showed similar or slightly higher scores in newly prepared blocks, with no significant differences across regions. Conclusion: Human brain tissue preserved as paraffin-embedded blocks or stored in fixative for up to 78 years remains suitable for immunohistochemical analyses. Adequate-to-good detection of aggregated α-synuclein, hyperphosphorylated tau and amyloid-β is achievable, indicating preserved pathological hallmarks of Lewy body disease and Alzheimer's disease in archival tissue.
Heysmond, S.; Kyratzi, P.; Wattis, J.; Paldi, A.; Brookes, K.; Kreft, K. L.; Shao, B.; Rauch, C.
Background: Quantitative genome-wide association studies (GWAS) primarily rely on additive linear models that compare average phenotypic differences between genotype groups. While effective for detecting common variants of moderate effect in large sample sizes, such approaches inherently reduce high-resolution phenotypic data to summary statistics (group averages), potentially limiting the detection of subtle genotype-phenotype relationships. Genomic Informational Field Theory (GIFT) is a recently developed methodology that preserves the fine-grained informational structure of quantitative traits by analysing ranked phenotypic configurations rather than relying solely on mean differences. Methods: We applied GIFT to genetic and neuropathological data from the Brains for Dementia Research cohort, a well-characterised dataset of 563 individuals, and compared its performance with conventional GWAS. A principal component analysis (PCA)-derived matrix was used to derive independent quantitative traits from Alzheimer disease (AD) neuropathology measures (CERAD, Thal, Braak staging), with and without inclusion of age at death. The principal components were analysed using the GWAS and GIFT frameworks on the same filtered genotype dataset. Results: Both GWAS and GIFT identified genome-wide significant associations (p-value < 0.000001) within the APOE locus (NECTIN2/TOMM40/APOE/APOC1), demonstrating concordance with established AD genetic variants. However, GIFT detected 19 additional significant SNPs beyond those identified by GWAS. Variants associated with AD pathology implicated genes involved in amyloid processing, neuronal apoptosis, synaptic function, neuroinflammation, and metabolic regulation. Notably, GIFT identified 29 loci associated with age-at-death-related variation that were not detected by GWAS, highlighting genes linked to lipophagy, mitochondrial quality control, sphingolipid metabolism, frailty, and aging-related processes.
Conclusions: GIFT recapitulates canonical GWAS findings while uncovering additional biologically relevant associations. By preserving the fine-grained structure of phenotypic data distributions and detecting non-random genotype segregation across ranked trait values, GIFT enables the identification of associations that remain undetected by traditional average-based GWAS approaches. These results demonstrate that rethinking analytical representation, rather than solely increasing sample size, can expand the discovery potential of genetic association studies, offering a transparent and complementary framework for quantitative genomics in deeply phenotyped datasets.
Qu, B.; Liu, W.; Zhou, L.; Guo, X.; Malin, B.; Yin, Z.
Dense breast tissue diminishes the sensitivity of mammographic screening and is a key cancer risk factor, which motivates accurate segmentation under scarce and expensive expert annotations in the medical imaging domain. Here, we benchmark the effect of backbone architecture, self-supervised pre-training (SSL), fine-tuning strategy, and loss design for dense-tissue segmentation on a small expert-labeled dataset (596 images) and an in-domain unlabeled corpus (20,000 images), reflecting the lack of large public pixel-level density datasets. CNNs (EfficientNet, Xception, nnUNet) clearly outperform transformer and Medical-SAM2 models, and full or layer-wise fine-tuning reliably exceeds parameter-efficient updates. Generic image-only SSL (MIM, SimCLR, Barlow Twins) often yields negligible or negative gains over ImageNet initialization, whereas a simple multi-view contrastive SSL and a hybrid segmentation-density loss provide the best accuracy and calibration (e.g., MAE from 14.8% to 11.8%, Spearman correlation with the four BI-RADS breast density categories from 0.42 to 0.51 on VinDr). We also quantify GPU hours for different SSL and fine-tuning choices, showing that only a small set of protocols, such as EfficientNet with multi-view SSL, hybrid loss, and full fine-tuning, offers favorable accuracy-efficiency trade-offs. These findings provide practical defaults for annotation-limited mammography studies and support compute-conscious deployment of automatic breast density assessment in web-based screening workflows.
Kohn, T. P.; Coady, P. J.; Oppenheimer, A. G.; Walia, A.; Hernadez, B. S.; Kohn, J. R.; Parikh, N.; Bazzi, M.; Stocks, B.; Khera, M.; Lipshultz, L. I.
Introduction: Non-obstructive azoospermia (NOA) represents the most severe form of male infertility. Current clinical tools have limited ability to predict sperm production or guide surgical sperm retrieval. Conventional B-mode ultrasound provides qualitative grayscale images and cannot characterize testicular microstructure relevant to spermatogenesis. Quantitative ultrasound (QUS) provides objective parameters from raw radiofrequency data, which quantitatively measure tissue heterogeneity. We hypothesize that men with spermatogenesis will have different QUS features compared to men without spermatogenesis (measured by total motile count, TMC, on semen analysis), with the goal of identifying imaging biomarkers for prognosis and intraoperative guidance. Methods: We prospectively analyzed men presenting for infertility evaluation who underwent high-frequency ultrasound imaging and semen analysis. Imaging was performed using a 36-MHz transducer with fixed acquisition parameters. Ninety-two QUS features were extracted from manually annotated testicular regions of interest, including Nakagami distribution parameters (m, ω, k), envelope statistics, and texture features. Univariate associations between each QUS feature and TMC were assessed using Spearman correlation with Bonferroni correction. Top-performing features were evaluated using logistic regression and receiver operating characteristic (ROC) analysis to discriminate sperm presence or absence (TMC>0 vs TMC=0). Results: Thirty-seven men (18 azoospermic, 19 with sperm present in the ejaculate) contributed 135 regions of interest. Seventeen of 92 QUS features significantly correlated with TMC after correction. The coefficient of variation of the Nakagami k-factor within the superficial testicular parenchyma (K_Zone1_Cv) demonstrated the strongest correlation (ρ=0.51, corrected p<0.001), suggesting that greater spatial heterogeneity in the superficial parenchyma was associated with higher sperm counts.
K_Zone1_Cv discriminated sperm presence with an AUC of 0.77 (95% CI 0.60-0.92), sensitivity 73.7%, and specificity 83.3%. QUS features with the highest univariate association were highly intercorrelated, suggesting a shared biological signal. ConclusionQuantitative ultrasound-derived measures of testicular microstructure heterogeneity correlate with sperm production and demonstrate moderate discrimination of sperm presence. These findings suggest QUS may provide a non-invasive imaging biomarker of spermatogenesis. Study findings warrant further assessment and validation in male infertility for sperm retrieval prognosis and the potential for intra-operative surgical guidance.
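The multiple-testing step named in the Methods (Bonferroni correction across 92 features) can be illustrated with a minimal sketch. It assumes the plain Bonferroni procedure and a family-wise alpha of 0.05, which the abstract does not state; the function name and the example p-values are hypothetical and are not study data.

```python
def bonferroni_adjust(p_values, n_tests):
    """Return Bonferroni-adjusted p-values (raw p times n_tests, capped at 1.0)."""
    return [min(1.0, p * n_tests) for p in p_values]


N_FEATURES = 92  # number of QUS features tested (from the abstract)
ALPHA = 0.05     # assumed family-wise error rate; not stated in the abstract

# Hypothetical raw p-values for three features (illustrative only).
raw_p = [1e-6, 4e-4, 0.01]

adjusted = bonferroni_adjust(raw_p, N_FEATURES)
significant = [p < ALPHA for p in adjusted]

# Equivalent view: a raw p-value must fall below this per-test threshold.
per_test_threshold = ALPHA / N_FEATURES
```

Under these assumptions, a raw p-value of 0.01 would not survive correction across 92 tests, which shows why only a subset of features remains significant after correction.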
Sahin, S.; Diaz, E.; Rajagopal, A.; Abtahi, M.; Jones, S.; Dai, Q.; Kramer, S.; Wang, Z.; Larson, P. E. Z.
Current standard-of-care imaging practices cannot reliably differentiate among certain renal tumors, such as benign oncocytoma and clear cell renal cell carcinoma (RCC), or between low- and high-grade RCCs. Previous work has explored using deep learning, radiomics, and texture analysis to predict renal tumor subtypes and differentiate between low- and high-grade RCCs with mixed success. To further this work, large diverse datasets are needed to improve model performance and provide strong evaluation sets. In this work, a dataset of 831 multi-phase 3D CT exams was curated. Each exam contains up to three contrast-enhanced CT phases. Tumor outlines or bounding boxes were annotated and registered to the image volumes. The pathology results for each tumor and relevant patient metadata are also included.
Cassim, N.; Stevens, W. S.; Glencross, D. K.; Coerzee, L.-M.
Background: In 2004, South Africa's public health system faced the dual challenge of rapidly scaling up antiretroviral therapy (ART) while reducing the cost of laboratory monitoring. At the time, conventional CD4 testing methods were expensive, labour-intensive, and impractical for sustaining a national testing network. This study aimed to assess the financial impact and cost savings associated with the implementation of the PanLeucogated CD4 (PLG/CD4) enumeration method between 2004 and 2024 in the public sector in South Africa. Methods: A longitudinal cost analysis was conducted using annual test volumes and state tariffs for PLG/CD4 testing and the 4-colour CD3/CD4/CD8/CD45 T-cell enumeration reference method. Annual cost savings were calculated in United States Dollars (USD) by applying historical South African Rand (ZAR) to USD exchange rates. The state prices for the PLG/CD4 and reference method tariff codes were provided by calendar year in ZAR and converted to USD based on the prevailing exchange rate. The USD test prices were multiplied by annual test volumes. Cost savings were calculated by multiplying annual test volumes by the difference in USD test prices between PLG/CD4 and the reference method. Results: There were 50,745,848 PLG/CD4 tests performed over 20 years. The cost-per-test of PLG/CD4 was consistently lower than the reference method, ranging from $4.06 to $9.40, compared with $13.06 to $28.21. Cumulative national savings amounted to USD 626 million. The peak annual savings of $64.6 million occurred in 2011, coinciding with the height of ART enrolment. Cost savings persisted despite a doubling in the exchange rate over the study period. Conclusion: The PLG/CD4 implementation enabled cost-efficient, scalable, quality-assured CD4 testing as part of the national HIV response, reducing reliance on complex/costly technologies while improving coverage. These findings support the critical role of context-specific diagnostic innovation to strengthen health system resilience.
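The savings calculation described in the Methods (convert each ZAR tariff to USD at that year's exchange rate, then multiply the price difference by the annual test volume) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code; the function name, the row layout, and every number in `rows` are hypothetical placeholders rather than study data.

```python
def annual_savings_usd(volume, plg_price_zar, ref_price_zar, zar_per_usd):
    """Savings for one year: volume times (reference price - PLG/CD4 price), in USD."""
    plg_usd = plg_price_zar / zar_per_usd  # PLG/CD4 tariff converted to USD
    ref_usd = ref_price_zar / zar_per_usd  # reference-method tariff converted to USD
    return volume * (ref_usd - plg_usd)


# Hypothetical rows: (label, test volume, PLG price ZAR, reference price ZAR, ZAR per USD).
rows = [
    ("Y1", 1_000_000, 60.0, 180.0, 7.5),
    ("Y2", 2_000_000, 64.0, 192.0, 8.0),
]

savings = {label: annual_savings_usd(v, plg, ref, fx) for label, v, plg, ref, fx in rows}
cumulative = sum(savings.values())
```

Because both tariffs are converted at the same rate within a year, a weaker ZAR shrinks the USD value of each test's saving, so growth in test volume is what allows the USD total to keep rising.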
Wu, J.; Perandini, L.; Batra, T.; Igoshin, S.; Bari, S.; de Araujo, A. L.; Willemink, M. J.
Digital breast tomosynthesis (DBT) is a powerful imaging modality that allows for improved lesion visibility, characterization, and localization compared to conventional two-dimensional digital mammography. DBT has been increasingly adopted in screening and diagnostic settings globally, particularly for women with dense breast tissue where tissue overlap presents a significant diagnostic challenge. Here we describe DBT-2026, a real-world imaging dataset with 558 DBT exams from 558 patients with Breast Imaging Reporting and Data System (BI-RADS) scores of 0, 1, or 2. Each case contains one DBT examination together with expert annotations and a free-text radiology report describing the radiological findings, produced in routine clinical practice. To protect patient privacy, all images and reports have been de-identified. The dataset is made freely available to researchers for non-commercial projects to facilitate and encourage research in breast cancer imaging.
Anderson, O.; Hung, R.; Fisher, S.; Weir, A.; Voisey, J. P.
Radiogenomics enables the non-invasive characterisation of the genomic and molecular properties of tumours, with epidermal growth factor receptor (EGFR) mutations in non-small cell lung cancer (NSCLC) being one of the most investigated applications. In this study, we evaluate radiomics, contrastive learning, and convolutional deep learning approaches to predict EGFR mutation status from chest computed tomography (CT) images using the TCIA Radiogenomics dataset (n=115). Our results, using 10-fold cross-validation, demonstrate the capacity of imaging models to predict mutation status from CT data in a manner consistent with existing literature. Among the evaluated methods, models integrating radiomic with clinical features achieved the best performance, with an AUC of 0.790 and AUPRC of 0.517, outperforming both contrastive learning (AUC=0.787) and convolutional architectures (AUC=0.763). Beyond methodological comparisons, we discuss the challenges related to clinical translation. Specifically, we contrast radiogenomics with conventional tissue biopsies, and identify scenarios where radiogenomics might be useful, either independently or in conjunction with other existing diagnostic technologies. Together, these findings support the potential utility of radiogenomic EGFR models and provide direct architecture comparisons on the same dataset.
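Since the comparison above rests on AUC, a minimal sketch of what that metric measures may help: the probability that a randomly chosen positive case (here, EGFR-mutant) receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name, labels, and scores below are hypothetical; this is not the authors' evaluation pipeline and it omits the cross-validation loop.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = mutant, 0 = wild-type) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of this size; a rank-based formulation gives the same value more efficiently on larger sets.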
Tong, T.; Zhang, W.; Zu, W.
Accurate polyp segmentation from colonoscopy images is critical for colorectal cancer prevention, yet the generalization of deep learning models under domain shift remains insufficiently explored. We propose Boundary-Explicit Guided Attention U-Net (BEGA-UNet), a boundary-aware segmentation architecture that introduces explicit edge modeling as a structural inductive bias to enhance both segmentation accuracy and cross-domain robustness. The framework integrates three components: an Edge-Guided Module (EGM) with learnable Sobel-initialized operators to capture boundary cues, a Dual-Path Attention (DPA) module that processes channel and spatial attention in parallel, and a Multi-Scale Feature Aggregation (MSFA) module to encode contextual information across multiple receptive fields. Evaluated on the combined Kvasir-SEG and CVC-ClinicDB benchmarks, BEGA-UNet achieves 88.53% Dice and 82.51% IoU, outperforming representative convolutional and transformer-based baselines. More importantly, cross-dataset evaluation demonstrates strong robustness under domain shift, with BEGA-UNet retaining 83.2% of its in-distribution performance, substantially higher than U-Net (64.5%), Attention U-Net (47.5%), and TransUNet (53.1%). In a zero-shot setting on an entirely unseen dataset, the model further maintains 72.6% performance retention. Comprehensive ablation studies indicate that explicit boundary modeling plays a central role in improving generalization, while multi-scale context aggregation further stabilizes performance across domains. Feature distribution analyses support this observation by showing that edge-oriented representations exhibit markedly reduced cross-domain variability compared to appearance-driven features. Overall, BEGA-UNet provides an effective and interpretable solution for robust polyp segmentation, demonstrating that explicit boundary modeling serves as a critical inductive bias for ensuring reliability under clinical domain shifts.
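The metrics quoted above can be made concrete with a small sketch. Dice and IoU follow their standard definitions on binary masks. The retention figure is assumed here to be the cross-dataset score divided by the in-distribution score, which the abstract implies but does not define. All function names and example values are hypothetical.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    n_pred = sum(1 for p in pred if p)
    n_target = sum(1 for t in target if t)
    union = n_pred + n_target - inter
    if union == 0:  # both masks empty: treat as perfect agreement
        return 1.0, 1.0
    dice = 2.0 * inter / (n_pred + n_target)
    iou = inter / union
    return dice, iou


def performance_retention(cross_domain_score, in_distribution_score):
    """Assumed definition: share of in-distribution performance kept under domain shift."""
    return cross_domain_score / in_distribution_score


# Hypothetical flattened masks (1 = polyp pixel).
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)

# Hypothetical scores, chosen only to illustrate the ratio.
retention = performance_retention(0.72, 0.90)
```

For a single mask pair, Dice and IoU are tied by IoU = Dice / (2 - Dice); scores averaged over a dataset, such as those reported above, need not satisfy that identity.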